1 Fast detection of communication patterns in distributed executions
Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced
Studies on Collaborative research CASCON '97
Publisher: IBM Press

Full text available: pdf(4.21 MB) Additional Information: full citation, abstract, references, index terms

Understanding distributed applications is a tedious and difficult task. Visualizations based 
on process-time diagrams are often used to obtain a better understanding of the execution 
of the application. The visualization tool we use is Poet, an event tracer developed at the 
University of Waterloo. However, these diagrams are often very complex and do not 
provide the user with the desired overview of the application. In our experience, such tools 
display repeated occurrences of non-trivial commun 



2 Temporal sequence learning and data reduction for anomaly detection
Terran Lane, Carla E. Brodley

August 1999 ACM Transactions on Information and System Security (TISSEC), Volume 2 Issue 3
Publisher: ACM Press

Additional Information: full citation, abstract, references, citings, index terms

The anomaly-detection problem can be formulated as one of learning to characterize the 
behaviors of an individual, system, or network in terms of temporal sequences of discrete 
data. We present an approach on the basis of instance-based learning (IBL) techniques. To 
cast the anomaly-detection task in an IBL framework, we employ an approach that 
transforms temporal sequences of discrete, unordered observations into a metric space via 
a similarity measure that encodes intra-attribute depende ... 

Keywords: anomaly detection, clustering, data reduction, empirical evaluation, instance 
based learning, machine learning, user profiling 



3 Data clustering: a review
A. K. Jain, M. N. Murty, P. J. Flynn

September 1999 ACM Computing Surveys (CSUR), Volume 31 Issue 3
Publisher: ACM Press

Full text available: pdf(636.24 KB) Additional Information: full citation, abstract, references, citings, index terms, review



Clustering is the unsupervised classification of patterns (observations, data items, or 
feature vectors) into groups (clusters). The clustering problem has been addressed in 
many contexts and by researchers in many disciplines; this reflects its broad appeal and 
usefulness as one of the steps in exploratory data analysis. However, clustering is a 
difficult problem combinatorially, and differences in assumptions and contexts in different 
communities has made the transfer of useful generic co ... 

Keywords: cluster analysis, clustering applications, exploratory data analysis, incremental 
clustering, similarity indices, unsupervised learning 

4 Image retrieval, users and usability: A search engine for historical manuscript images 
Toni M. Rath, R. Manmatha, Victor Lavrenko 

July 2004 Proceedings of the 27th annual international ACM SIGIR conference on
Research and development in information retrieval SIGIR '04

Publisher: ACM Press

Full text available: pdf(346.95 KB) Additional Information: full citation, abstract, references, citings, index terms

Many museum and library archives are digitizing their large collections of handwritten 
historical manuscripts to enable public access to them. These collections are only available 
in image formats and require expensive manual annotation work for access to them. 
Current handwriting recognizers have word error rates in excess of 50% and therefore 
cannot be used for such material. We describe two statistical models for retrieval in large 
collections of handwritten manuscripts given a text query. Bo ... 

Keywords: handwriting retrieval, historical manuscripts, relevance models 



5 Automatic summarization of music videos 

Xi Shao, Changsheng Xu, Namunu C. Maddage, Qi Tian, Mohan S. Kankanhalli, Jesse S. Jin 
May 2006 ACM Transactions on Multimedia Computing, Communications, and
Applications (TOMCCAP), Volume 2 Issue 2

Publisher: ACM Press

Full text available: pdf(488.75 KB) Additional Information: full citation, abstract, references, index terms

In this article, we propose a novel approach for automatic music video summarization. The 
proposed summarization scheme is different from the current methods used for video 
summarization. The music video is separated into the music track and video track. For the 
music track, a music summary is created by analyzing the music content using music 
features, an adaptive clustering algorithm, and music domain knowledge. Then, shots in 
the video track are detected and clustered. Finally, the music vide ... 

Keywords: Music summarization, music video, video summarization 



6 Content session 5: image annotation: Real-time computerized annotation of pictures
Jia Li, James Z. Wang

October 2006 Proceedings of the 14th annual ACM international conference on
Multimedia MULTIMEDIA '06

Publisher: ACM Press

Full text available: pdf(1.59 MB) Additional Information: full citation, abstract, references, index terms

Automated annotation of digital pictures has been a highly challenging problem for 
computer scientists since the invention of computers. The capability of annotating pictures 
by computers can lead to breakthroughs in a wide range of applications including Web 
image search, online picture-sharing communities, and scientific experiments. In our work, 
by advancing statistical modeling and optimization techniques, we can train computers 
about hundreds of semantic concepts using example pictures from ... 

Keywords: clustering, image annotation, modeling, statistical learning

7 Research papers mining biological and medical data: Subsequence matching on
structured time series data

Huanmei Wu, Betty Salzberg, Gregory C Sharp, Steve B Jiang, Hiroki Shirato, David Kaeli
June 2005 Proceedings of the 2005 ACM SIGMOD international conference on
Management of data SIGMOD '05
Publisher: ACM Press

Full text available: pdf(930.08 KB) Additional Information: full citation, abstract, references



Subsequence matching in time series databases is a useful technique, with applications in 
pattern matching, prediction, and rule discovery. Internal structure within the time series 
data can be used to improve these tasks, and provide important insight into the problem 
domain. This paper introduces our research effort in using the internal structure of a time 
series directly in the matching process. This idea is applied to the problem domain of 
respiratory motion data in cancer radiation treatme ... 

8 Three-dimensional object recognition
Paul J. Besl, Ramesh C. Jain

March 1985 ACM Computing Surveys (CSUR), Volume 17 Issue 1
Publisher: ACM Press

Full text available: pdf(7.76 MB) Additional Information: full citation, abstract, references, citings, index terms, review

A general-purpose computer vision system must be capable of recognizing three- 
dimensional (3-D) objects. This paper proposes a precise definition of the 3-D object 
recognition problem, discusses basic concepts associated with this problem, and reviews 
the relevant literature. Because range images (or depth maps) are often used as sensor 
input instead of intensity images, techniques for obtaining, processing, and characterizing 
range data are also surveyed. 

9 Video abstraction: A systematic review and classification
Ba Tu Truong, Svetha Venkatesh

February 2007 ACM Transactions on Multimedia Computing, Communications, and
Applications (TOMCCAP), Volume 3 Issue 1
Publisher: ACM Press

Full text available: pdf(380.77 KB) Additional Information: full citation, appendices and supplements, abstract, references, index terms

The demand for various multimedia applications is rapidly increasing due to the recent 
advance in the computing and network infrastructure, together with the widespread use of 
digital video technology. Among the key elements for the success of these applications is 
how to effectively and efficiently manage and store a huge amount of audio visual 
information, while at the same time providing user-friendly access to the stored data. This 
has fueled a quickly evolving research area known as vide ... 

Keywords: Video summarization, keyframe, survey, video abstraction, video skimming 



10 Multiple sensor integration for indoor surveillance
Valery A. Petrushin, Gang Wei, Rayid Ghani, Anatole V. Gershman

August 2005 Proceedings of the 6th international workshop on Multimedia data
mining: mining integrated media and complex data MDM '05

Publisher: ACM Press

Full text available: pdf(663.64 KB) Additional Information: full citation, abstract, references, index terms

Multiple Sensor Indoor Surveillance (MSIS) is a research project at Accenture Technology 
Labs aimed at exploring a variety of redundant sensors in a networked environment where 
each sensor is giving noisy information and the goal is to coherently reason about some 
aspect of the environment. We describe the objectives of the project, the problems it was 
designed to solve and some recent results. The environment includes 32 web cameras, an 
infrared badge ID system, a PTZ camera, and a fingerprint ... 

Keywords: Bayesian inference, indoor surveillance, multi-camera surveillance, people 
localization, rare event detection, self-organizing maps, unsupervised learning, 
visualization 



11 Inverted files for text search engines
Justin Zobel, Alistair Moffat

July 2006 ACM Computing Surveys (CSUR), Volume 38 Issue 2
Publisher: ACM Press

Full text available: pdf(944.29 KB) Additional Information: full citation, abstract, references, index terms

The technology underlying text search engines has advanced dramatically in the past 
decade. The development of a family of new index representations has led to a wide range 
of innovations in index storage, index construction, and query evaluation. While some of 
these developments have been consolidated in textbooks, many specific techniques are 
not widely known or the textbook descriptions are out of date. In this tutorial, we 
introduce the key techniques in the area, describing both a core impl ... 

Keywords: Inverted file indexing, Web search engine, document database, information 
retrieval, text retrieval 




12 Special session 2: multimedia information retrieval: challenges and real-world
applications: Content-based image retrieval: approaches and trends of the new age
Ritendra Datta, Jia Li, James Z. Wang

November 2005 Proceedings of the 7th ACM SIGMM international workshop on
Multimedia information retrieval MIR '05
Publisher: ACM Press

Full text available: pdf(467.64 KB) Additional Information: full citation, abstract, references, citings, index terms

The last decade has witnessed great interest in research on content-based image retrieval. 
This has paved the way for a large number of new techniques and systems, and a growing 
interest in associated fields to support such systems. Likewise, digital imagery has 
expanded its horizon in many directions, resulting in an explosion in the volume of image 
data required to be organized. In this paper, we discuss some of the key contributions in 
the current decade related to image retrieval and automat ... 

Keywords: annotation, content-based image retrieval 



13 Temporal sequence learning and data reduction for anomaly detection

Terran Lane, Carla E. Brodley
November 1998 Proceedings of the 5th ACM conference on Computer and
communications security CCS '98

Publisher: ACM Press

Full text available: pdf(1.12 MB) Additional Information: full citation, references, citings, index terms



14 Industry track papers: ADMIT: anomaly-based data mining for intrusions 
Karlton Sequeira, Mohammed Zaki 

July 2002 Proceedings of the eighth ACM SIGKDD international conference on 
Knowledge discovery and data mining KDD '02 

Publisher: ACM Press 

Full text available: pdf(1.33 MB) Additional Information: full citation, abstract, references, citings, index terms

Security of computer systems is essential to their acceptance and utility. Computer 
security analysts use intrusion detection systems to assist them in maintaining computer 
system security. This paper deals with the problem of differentiating between 
masqueraders and the true user of a computer terminal. Prior efficient solutions are less 
suited to real time application, often requiring all training data to be labeled, and do not 
inherently provide an intuitive idea of what the data model means. ... 

15 Large-scale collections: Organizing the OCA: learning faceted subjects from a library
of digital books

David Mimno, Andrew McCallum

June 2007 Proceedings of the 2007 conference on Digital libraries JCDL '07

Publisher: ACM Press

Full text available: pdf(537.50 KB) Additional Information: full citation, abstract, references, index terms

Large scale library digitization projects such as the Open Content Alliance are producing 
vast quantities of text, but little has been done to organize this data. Subject headings 
inherited from card catalogs are useful but limited, while full-text indexing is most 
appropriate for readers who already know exactly what they want. Statistical topic models 
provide a complementary function. These models can identify semantically coherent 
"topics" that are easily recognizable and meaningful to hum ... 




Keywords: classification, topic models 



16 Exploiting perception in high-fidelity virtual environments: Exploiting perception in
high-fidelity virtual environments

Additional presentations from the 24th course are available on the citation page

Mashhuda Glencross, Alan G. Chalmers, Ming C. Lin, Miguel A. Otaduy, Diego Gutierrez
July 2006 ACM SIGGRAPH 2006 Courses SIGGRAPH '06
Publisher: ACM Press

Full text available: pdf(5.07 MB), mov(68:6 MIN) Additional Information: full citation, appendices and supplements, abstract, references, cited by, index terms

The objective of this course is to provide an introduction to the issues that must be 
considered when building high-fidelity 3D engaging shared virtual environments. The 
principles of human perception guide important development of algorithms and techniques 
in collaboration, graphical, auditory, and haptic rendering. We aim to show how human 
perception is exploited to achieve realism in high fidelity environments within the 
constraints of available finite computational resources. In this course w ... 

Keywords: collaborative environments, haptics, high-fidelity rendering, human-computer 
interaction, multi-user, networked applications, perception, virtual reality 




17 Link aggregation: Untangling compound documents on the web
Nadav Eiron, Kevin S. McCurley

August 2003 Proceedings of the fourteenth ACM conference on Hypertext and
hypermedia HYPERTEXT '03

Publisher: ACM Press

Full text available: pdf(192.59 KB) Additional Information: full citation, abstract, references, citings, index terms

Most text analysis is designed to deal with the concept of a "document", namely a 
cohesive presentation of thought on a unifying subject. By contrast, individual nodes on 
the World Wide Web tend to have a much smaller granularity than text documents. We 
claim that the notions of "document" and "web node" are not synonymous, and that 
authors often tend to deploy documents as collections of URLs, which we call "compound 
documents". In this paper we present new techniques for identifying and workin ... 

Keywords: composites, hypertext, information retrieval, semantic web, wasted space

18 Research track posters: Diagnosing extrapolation: tree-based density estimation
Giles Hooker

August 2004 Proceedings of the tenth ACM SIGKDD international conference on
Knowledge discovery and data mining KDD '04

Publisher: ACM Press

Full text available: pdf(378.85 KB) Additional Information: full citation, abstract, references, index terms




There has historically been very little concern with extrapolation in Machine Learning, yet 
extrapolation can be critical to diagnose. Predictor functions are almost always learned on 
a set of highly correlated data comprising a very small segment of predictor space. 
Moreover, flexible predictors, by their very nature, are not controlled at points of 
extrapolation. This becomes a problem for diagnostic tools that require evaluation on a 
product distribution. It is also an issue when we are tryin ... 

Keywords: C4.5, CART, clustering, density estimation, diagnostics, extrapolation, 
interpretation, modeling methodologies, trees-based models, visualization 

19 Reports from related meetings: Interface '99: a data mining overview

A. Arnold Goodman

January 2000 ACM SIGKDD Explorations Newsletter Volume 1 Issue 2
Publisher: ACM Press

Full text available: pdf(851.62 KB) Additional Information: full citation, abstract, references

This personal overview of Interface '99 is intended to communicate its meaning and 
relevance to SIGKDD, as well as provide valuable information on trends within the 
Interface for data miners seeking to learn more about statistics. In addition, it is the 
newest link in a bridge between the Interface and KDD begun by References 2-4 and the 
sessions on KDD at Interface '98 and Interface '99. 

Keywords: review of Interface'99 conference, statistics 



20 Building and using cultural digital libraries: Supporting access to large digital oral
history archives

Samuel Gustman, Dagobert Soergel, Douglas Oard, William Byrne, Michael Picheny, Bhuvana
Ramabhadran, Douglas Greenberg

July 2002 Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries 
JCDL '02 

Publisher: ACM Press 



This paper describes our experience with the creation, indexing, and provision of access to 
a very large archive of videotaped oral histories - 116,000 hours of digitized interviews in 
32 languages from 52,000 survivors, liberators, rescuers, and witnesses of the Nazi 
Holocaust. It goes on to identify a set of critical research issues that must be addressed if 
we are to provide full and detailed access to collections of this size: issues in user 
requirement studies, automatic speech recognition, ... 

Keywords: cataloging, oral history, research agenda

21 iDistance: An adaptive B+-tree based indexing method for nearest neighbor search

H. V. Jagadish, Beng Chin Ooi, Kian-Lee Tan, Cui Yu, Rui Zhang 
June 2005 ACM Transactions on Database Systems (TODS), volume 30 issue 2 

Publisher: ACM Press 

Full text available: pdf(1.16 MB) Additional Information: full citation, abstract, references, citings, index terms

In this article, we present an efficient B+-tree based indexing method, called iDistance,
for K-nearest neighbor (KNN) search in a high-dimensional metric space. iDistance 
partitions the data based on a space- or data-partitioning strategy, and selects a reference 
point for each partition. The data points in each partition are transformed into a single 
dimensional value based on their similarity with respect to the reference point. This allows 
the points to be indexed using a B ...

Keywords: Indexing, KNN, nearest neighbor queries



22 The elements of nature: interactive and realistic techniques 
Oliver Deusen, David S. Ebert, Ron Fedkiw, F. Kenton Musgrave, Przemyslaw Prusinkiewicz, 
Doug Roble, Jos Stam, Jerry Tessendorf 

August 2004 ACM SIGGRAPH 2004 Course Notes SIGGRAPH '04 

Publisher: ACM Press 

Full text available: pdf(17.65 MB) Additional Information: full citation, abstract

This updated course on simulating natural phenomena will cover the latest research and 
production techniques for simulating most of the elements of nature. The presenters will 
provide movie production, interactive simulation, and research perspectives on the difficult 
task of photorealistic modeling, rendering, and animation of natural phenomena. The 
course offers a nice balance of the latest interactive graphics hardware-based simulation 
techniques and the latest physics-based simulation techni ... 

23 Learning to Detect and Classify Malicious Executables in the Wild
J. Zico Kolter, Marcus A. Maloof

December 2006 The Journal of Machine Learning Research, Volume 7
Publisher: MIT Press

Full text available: pdf(242.79 KB) Additional Information: full citation, abstract

We describe the use of machine learning and data mining to detect and classify malicious 
executables as they appear in the wild. We gathered 1,971 benign and 1,651 malicious 
executables and encoded each as a training example using n-grams of byte codes as 
features. Such processing resulted in more than 255 million distinct n-grams. After 
selecting the most relevant n-grams for prediction, we evaluated a variety of inductive

24 A machine transliteration model based on correspondence between graphemes and
phonemes

Jong-Hoon Oh, Key-Sun Choi, Hitoshi Isahara

September 2006 ACM Transactions on Asian Language Information Processing
(TALIP), Volume 5 Issue 3
Publisher: ACM Press

Full text available: pdf(445.63 KB) Additional Information: full citation, abstract, references, index terms

Machine transliteration is an automatic method for converting words in one language into 
phonetically equivalent ones in another language. There has been growing interest in the 
use of machine transliteration to assist machine translation and information retrieval. 
Three types of machine transliteration models— grapheme-based, phoneme-based, and 
hybrid— have been proposed. Surprisingly, there have been few reports of efforts to utilize 
the correspondence between source graphemes and source pho ... 

Keywords: Machine transliteration, grapheme and phoneme, information retrieval, 
machine translation, natural language processing 



25 Data mining: Time-dependent semantic similarity measure of queries using historical
click-through data

Qiankun Zhao, Steven C. H. Hoi, Tie-Yan Liu, Sourav S. Bhowmick, Michael R. Lyu, Wei-Ying
Ma

May 2006 Proceedings of the 15th international conference on World Wide Web
WWW '06

Publisher: ACM Press

Full text available: pdf(347.70 KB) Additional Information: full citation, abstract, references, cited by, index terms

It has become a promising direction to measure similarity of Web search queries by mining 
the increasing amount of click-through data logged by Web search engines, which record 
the interactions between users and the search engines. Most existing approaches employ 
the click-through data for similarity measure of queries with little consideration of the 
temporal factor, while the click-through data is often dynamic and contains rich temporal 
information. In this paper we present a new framework of ... 

Keywords: click-through data, event detection, evolution pattern, marginalized kernel, 
semantic similarity measure 



26 A hierarchical access control model for video database systems 

Elisa Bertino, Jianping Fan, Elena Ferrari, Mohand-Said Hacid, Ahmed K. Elmagarmid,
Xingquan Zhu

April 2003 ACM Transactions on Information Systems (TOIS), Volume 21 Issue 2
Publisher: ACM Press

Full text available: pdf(6.27 MB) Additional Information: full citation, abstract, references, citings, index terms

Content-based video database access control is becoming very important, but it depends 
on the progresses of the following related research issues: (a) efficient video analysis for 
supporting semantic visual concept representation; (b) effective video database indexing 
structure; (c) the development of suitable video database models; and (d) the 
development of access control models tailored to the characteristics of video data. In this 
paper, we propose a novel approach to support multilevel acce ... 

Keywords: Video database models, access control, indexing schemes

27 Visualizing geospatial data

Theresa Marie Rhyne, Alan MacEachren, Theresa-Marie Rhyne
August 2004 ACM SIGGRAPH 2004 Course Notes SIGGRAPH '04
Publisher: ACM Press

Full text available: pdf(14.01 MB) Additional Information: full citation, abstract

This course reviews concepts and highlights new directions in GeoVisualization. We review 
four levels of integrating geospatial data and geographic information systems (GIS) with 
scientific and information visualization (VIS) methods. These include:
* Rudimentary: minimal data sharing between the GIS and Vis systems
* Operational: consistency of geospatial data
* Functional: transparent communication between the GIS and Vis systems
* Merged: one comprehensive toolkit environment
W ...

28 Research sessions: stream management: Online event-driven subsequence
matching over financial data streams

Huanmei Wu, Betty Salzberg, Donghui Zhang

June 2004 Proceedings of the 2004 ACM SIGMOD international conference on
Management of data SIGMOD '04
Publisher: ACM Press

Full text available: pdf(753.59 KB) Additional Information: full citation, abstract, references, citings

Subsequence similarity matching in time series databases is an important research area 
for many applications. This paper presents a new approximate approach for automatic 
online subsequence similarity matching over massive data streams. With a simultaneous 
on-line segmentation and pruning algorithm over the incoming stream, the resulting 
piecewise linear representation of the data stream features high sensitivity and accuracy. 
The similarity definition is based on a permutation followed by a met ... 

29 Compiler construction: an advanced course 

F. L. Bauer, F. L. De Remer, M. Griffiths, U. Hill, J. J. Horning, C. H. A. Koster, W. M. 
McKeeman, P. C. Poole, W. M. Waite, G. Goos, J. Hartmanis 
January 1974 Book 

Publisher: Springer-Verlag New York, Inc. 

Additional Information: full citation, abstract, references, cited by

The Advanced Course took place from March 4 to 15, 1974 and was organized by the 
Mathematical Institute of the Technical University of Munich and the Leibniz Computing 
Center of the Bavarian Academy of Sciences, in co-operation with the European 
Communities, sponsored by the Ministry for Research and Technology of the Federal 
Republic of Germany and by the European Research Office, London. 



30 Posters: Content-based image retrieval by clustering
Yixin Chen, James Z. Wang, Robert Krovetz

November 2003 Proceedings of the 5th ACM SIGMM international workshop on
Multimedia information retrieval MIR '03

Publisher: ACM Press

Full text available: pdf(658.35 KB) Additional Information: full citation, abstract, references, citings, index terms

In a typical content-based image retrieval (CBIR) system, query results are a set of 
images sorted by feature similarities with respect to the query. However, images with high 
feature similarities to the query may be very different from the query in terms of 
semantics. This is known as the semantic gap. We introduce a novel image retrieval 
scheme, CLUster-based rEtrieval of images by unsupervised learning (CLUE), which 
tackles the semantic gap problem based on a hypothesis: semantically simil ... 

Keywords: content-based image retrieval, image classification, spectral graph clustering, 
unsupervised learning

31 Course 17: Spatial augmented reality: merging real and virtual worlds: Modern
approaches to augmented reality

Video files associated with this course are available from the citation page

Oliver Bimber, Ramesh Raskar

August 2007 ACM SIGGRAPH 2007 courses SIGGRAPH '07
Publisher: ACM Press

Full text available: pdf(46.17 MB) Additional Information: full citation, appendices and supplements, abstract, references, index terms

This tutorial discusses the Spatial Augmented Reality (SAR) concept, its advantages and 
limitations. It will present examples of state-of-the-art display configurations, appropriate 
real-time rendering techniques, details about hardware and software implementations, and 
current areas of application. Specifically, it will describe techniques for optical combination 
using single/multiple spatially aligned mirror-beam splitters, image sources, transparent 
screens and optical holograms. Furthermo ... 

32 Towards a road map on human language technology: natural language processing
Andreas Eisele, Dorothea Ziegler-Eisele

August 2002 COLING-02 on A roadmap for computational linguistics - Volume 13
Publisher: Association for Computational Linguistics

Full text available: pdf(270.05 KB) Additional Information: full citation, abstract, references

This document summarizes contributions and discussions from two workshops that took 
place in November 2000 and July 2001. It presents some visions of NLP-related 
applications that may become reality within ten years from now. It investigates the 
technological requirements that must be met in order to make these visions realistic and 
sketches milestones that may help to measure our progress towards these goals. 

33 Information retrieval on the web
Mei Kobayashi, Koichi Takeda

June 2000 ACM Computing Surveys (CSUR), Volume 32 Issue 2
Publisher: ACM Press

Full text available: pdf(213.89 KB) Additional Information: full citation, abstract, references, citings, index terms

In this paper we review studies of the growth of the Internet and technologies that are 
useful for information search and retrieval on the Web. We present data on the Internet 
from several different sources, e.g., current as well as projected number of users, hosts, 
and Web sites. Although numerical figures vary, overall trends cited by the sources are 
consistent and point to exponential growth in the past and in the coming decade. Hence it 
is not surprising that about 85% of Internet user ... 

Keywords: Internet, World Wide Web, clustering, indexing, information retrieval, 
knowledge management, search engine 



34 Semantic clustering and querying on heterogeneous features for visual data 
Gholamhosein Sheikholeslami, Wendy Chang, Aidong Zhang 

September 1998 Proceedings of the sixth ACM international conference on Multimedia 
MULTIMEDIA '98 

Publisher: ACM Press 

Full text available: pdf(1.37 MB) Additional Information: full citation, references, citings, index terms



35 Image I: Retrieval of 3D objects by visual similarity 
Jurgen Assfalg, Alberto Del Bimbo, Pietro Pala 

October 2004 Proceedings of the 6th ACM SIGMM international workshop on
Multimedia information retrieval MIR '04

Publisher: ACM Press

Full text available: pdf(166.89 KB) Additional Information: full citation, abstract, references, citings, index terms

Along with images and videos, 3D models have recently gained increasing attention for a
number of reasons: advancements in 3D hardware and software technologies, their ever 
decreasing prices and increasing availability, affordable 3D authoring tools, and the 
establishment of open standards for 3D data interchange. The ever increasing availability 
of 3D models demands for tools supporting their effective and efficient management. 
Among these tools, those enabling content-based retrieval play a ... 

Keywords: 3D models, content-based retrieval, spin images 



36 Using dual cascading learning frameworks for image indexing 
Joo-Hwee Lim, Jesse S. Jin 

June 2004 Proceedings of the Pan-Sydney area workshop on Visual information 
processing VIP '05 

Publisher: Australian Computer Society, Inc. 

Full text available: pdf(254.67 KB) Additional Information: full citation, abstract, references, index terms

To bridge the semantic gap in content-based image retrieval, detecting meaningful visual 
entities (e.g. faces, sky, foliage, buildings etc) in image content and classifying images 
into semantic categories based on trained pattern classifiers have become active research 
trends. In this paper, we present dual cascading learning frameworks that extract and 
combine intra-image and inter-class semantics for image indexing and retrieval. In the 
supervised learning version, support vector detectors are ... 

Keywords: image classification, image indexing, image retrieval, pattern discovery, 
similarity matching 



37 Facial modeling and animation 
Jorg Haber, Demetri Terzopoulos 

August 2004 ACM SIGGRAPH 2004 Course Notes SIGGRAPH '04 
Publisher: ACM Press 

Full text available: pdf(18.15 MB) Additional Information: full citation, abstract

In this course we present an overview of the concepts and current techniques in facial 
modeling and animation. We introduce this research area by its history and applications. 
As a necessary prerequisite for facial modeling, data acquisition is discussed in detail. We 
describe basic concepts of facial animation and present different approaches including 
parametric models, performance-, physics-, and learning-based methods. State-of-the-art 
techniques such as muscle-based facial animation, mass-s ... 

38 Machine learning in automated text categorization 
Fabrizio Sebastiani

March 2002 ACM Computing Surveys (CSUR), Volume 34 Issue 1
Publisher: ACM Press 

Full text available: pdf(524.41 KB) Additional Information: full citation, abstract, references, citings, index terms

The automated categorization (or classification) of texts into predefined categories has 
witnessed a booming interest in the last 10 years, due to the increased availability of 
documents in digital form and the ensuing need to organize them. In the research 
community the dominant approach to this problem is based on machine learning
techniques: a general inductive process automatically builds a classifier by learning, from
a set of preclassified documents, the characteristics of the categories. ...

Keywords: Machine learning, text categorization, text classification

39 Selected writings on computing: a personal perspective
Edsger W. Dijkstra 
January 1982 Book 

Publisher: Springer-Verlag New York, Inc. 

Additional Information: full citation, abstract, references, cited by, index terms

Since the summer of 1973, when I became a Burroughs Research Fellow, my life has been 
very different from what it had been before. The daily routine changed: instead of going to 
the University each day, where I used to spend most of my time in the company of others, 
I now went there only one day a week and was most of the time (that is, when not
travelling!) alone in my study. In my solitude, mail and the written word in general
became more and more important. The circumstance that my employe ... 

40 Cross-lingual C*ST*RD: English access to Hindi information
Anton Leuski, Chin-Yew Lin, Liang Zhou, Ulrich Germann, Franz Josef Och, Eduard Hovy

September 2003 ACM Transactions on Asian Language Information Processing
(TALIP), Volume 2 Issue 3 
Publisher: ACM Press 

Full text available: pdf(210.61 KB) Additional Information: full citation , abstract , references , index terms 

We present C*ST*RD, a cross-language information delivery system that supports cross- 
language information retrieval, information space visualization and navigation, machine 
translation, and text summarization of single documents and clusters of documents. 
C*ST*RD was assembled and trained within 1 month, in the context of DARPA's Surprise 
Language Exercise, that selected as source a heretofore unstudied language, Hindi. Given 
the brief time, we could not create deep Hindi capabilities for all th ... 

Keywords: Cross-language information retrieval, Hindi-to-English machine translation, 
headline generation, information retrieval and information space navigation, single- and 
multi-document text summarization

41 Dissemination of compressed historical information in sensor networks
Antonios Deligiannakis, Yannis Kotidis, Nick Roussopoulos 

October 2007 The VLDB Journal — The International Journal on Very Large Data 

Bases, Volume 16 Issue 4 
Publisher: Springer-Verlag New York, Inc. 
Additional Information: full citation, abstract

Sensor nodes are small devices that "measure" their environment and communicate feeds 
of low-level data values to a base station for further processing and archiving. 
Dissemination of these multi-valued feeds is challenging because of the limited resources 
(processing, bandwidth, energy) available in the nodes of the network. In this paper, we 
first describe the SBR algorithm for compressing multi-valued feeds containing historical 
data from each sensor. The key to our technique is the ...

Keywords: Compression, Sensor Networks

42 Appearance-based video clustering in 2D locality preserving projection subspace
Li-Qun Xu, Bin Luo 

July 2007 Proceedings of the 6th ACM international conference on Image and video 
retrieval CIVR '07 

Publisher: ACM Press 

Full text available: pdf(546.53 KB) Additional Information: full citation, abstract, references, index terms

In this paper we introduce an effective and unified approach to creating quality video 
abstractions. The research was motivated by a recently developed subspace learning 
method called 2D-LPP, or two-dimensional Locality Preserving Projection, which proved to 
be effective for dimensionality reduction and discriminating enough in 'appearance-based'
image recognitions. By exploiting temporal constraints (sequential correlations / 
contextual content) inherent in a video (vs. random collection of ... 

Keywords: 2D-PCA, 2D-locality preserving projection, data clustering, home video 
analysis, video browsing, video abstraction



43 Temporal event clustering for digital photo collections 

Matthew Cooper, Jonathan Foote, Andreas Girgensohn, Lynn Wilcox 
August 2005 ACM Transactions on Multimedia Computing, Communications, and
Applications (TOMCCAP), Volume 1 Issue 3
Publisher: ACM Press 

Additional Information: full citation, abstract, references, citings, index terms

Organizing digital photograph collections according to events such as holiday gatherings or
vacations is a common practice among photographers. To support photographers in this 
task, we present similarity-based methods to cluster digital photos by time and image 
content. The approach is general and unsupervised, and makes minimal assumptions 
regarding the structure or statistics of the photo collection. We present several variants of 
an automatic unsupervised algorithm to partition a collection ... 

Keywords: Digital photo organization, digital libraries, temporal media indexing 



44 Automatic generation of "hyper-paths" in information retrieval systems: a stochastic
and an incremental algorithms
Alain Lelu

September 1991 Proceedings of the 14th annual international ACM SIGIR conference 
on Research and development in information retrieval SIGIR '91 

Publisher: ACM Press 

Full text available: pdf(981.55 KB) Additional Information: full citation, references, citings, index terms



45 Extracting predicates from mining models for efficient query evaluation
Surajit Chaudhuri, Vivek Narasayya, Sunita Sarawagi
September 2004 ACM Transactions on Database Systems (TODS), Volume 29 Issue 3

Publisher: ACM Press 

Full text available: pdf(698.37 KB) Additional Information: full citation, abstract, references, index terms

Modern relational database systems are beginning to support ad hoc queries on mining 
models. In this article, we explore novel techniques for optimizing queries that contain 
predicates on the results of application of mining models to relational data. For such 
queries, we use the internal structure of the mining model to automatically derive 
traditional database predicates. We present algorithms for deriving such predicates for a 
large class of popular discrete mining models: decision trees, nai ... 

Keywords: Complex predicate optimization, simpler rules from complex predictive 
functions 



46 Special issue on ICML: Coupled clustering: a method for detecting structural
correspondence

Zvika Marx, Ido Dagan, Joachim M. Buhmann, Eli Shamir 

March 2003 The Journal of Machine Learning Research, volume 3 

Publisher: MIT Press 

Full text available: pdf(967.15 KB) Additional Information: full citation, abstract, citings, index terms

This paper proposes a new paradigm and a computational framework for revealing 
equivalencies (analogies) between sub-structures of distinct composite systems that are 
initially represented by unstructured data sets. For this purpose, we introduce and 
investigate a variant of traditional data clustering, termed coupled clustering, which 
outputs a configuration of corresponding subsets of two such representative sets. We 
apply our method to synthetic as well as textual data. Its achievement ... 

47 Smalltalk-80: the language and its implementation
Adele Goldberg, David Robson 

January 1983 Book 

Publisher: Addison-Wesley Longman Publishing Co., Inc. 

Full text available: pdf(33.56 MB) Additional Information: full citation, abstract, cited by, index terms, review
From the Preface (See Front Matter for full Preface)

Advances in the design and production of computer hardware have brought many more
people into direct contact with computers. Similar advances in the design and production
of computer software are required in order that this increased contact be as rewarding as 
possible. The Smalltalk-80 system is a result of a decade of research into creating 
computer software that is appropriate for producing highly functional and interactive ... 



48 Recreational computer graphics: Recreational computer graphics
Andrew Glassner 

July 2006 ACM SIGGRAPH 2006 Courses SIGGRAPH '06 
Publisher: ACM Press 

Full text available: pdf(13.82 MB) Additional Information: full citation, abstract, index terms

Computer graphics isn't just a bunch of algorithms and programs: it's a gymnasium for the 
visual imagination, and a tool for investigating the world around us. Graphics can help us 
understand nature, invent new kinds of patterns and shapes, build up the clarity of our 
own mind's eye, and experiment with construction tools that would inspire even the most 
classical sculptors and painters. Going beyond tools and technique, this course invites 
attendees to think about using computer graphics in new ... 

49 Research track papers: Using hierarchical clustering for learning the ontologies used
in recommendation systems
Vincent Schickel-Zuber, Boi Faltings

August 2007 Proceedings of the 13th ACM SIGKDD international conference on 
Knowledge discovery and data mining KDD '07 

Publisher: ACM Press 

Full text available: pdf(1.01 MB) Additional Information: full citation, abstract, references, index terms

Ontologies are being successfully used to overcome semantic heterogeneity, and are
becoming fundamental elements of the Semantic Web. Recently, it has also been shown
that ontologies can be used to build more accurate and more personalized recommendation
systems by inferencing missing user's preferences. However, these systems assume the
existence of ontologies, without considering their construction. With product catalogs
changing continuously, new techniques are required in order to build these on ...

Keywords: ontology, performance, recommendation systems 




50 Modeling and querying moving objects in networks 
Hartmut Guting, Teixeira de Almeida, Zhiming Ding 

June 2006 The VLDB Journal — The International Journal on Very Large Data Bases, 

Volume 15 Issue 2 
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(610.40 KB) Additional Information: full citation, abstract, references

Moving objects databases have become an important research issue in recent years. For 
modeling and querying moving objects, there exists a comprehensive framework of 
abstract data types to describe objects moving freely in the 2D plane, providing data types 
such as moving point or moving region. However, in many applications people or vehicles 
move along transportation networks. It makes a lot of sense to model the network 
explicitly and to describe movements relative to the networ ... 

Keywords: ADT, Data type, Moving object, Network, Spatio-temporal 



51 Contributed articles: Genetic subtyping using cluster analysis
Tom Burr, James R. Gattiker, Greggory S. LaBerge
July 2001 ACM SIGKDD Explorations Newsletter, Volume 3 Issue 1

Publisher: ACM Press 

Full text available: pdf(984.40 KB) Additional Information: full citation, abstract, references

In this paper we (1) describe state-of-the-art methods to identify clusters in DNA
sequence data for taxonomic analysis; (2) describe a new method with better scaling
properties based on model-based clustering, and (3) present examples using the
nucleoprotein and hemagglutinin regions of influenza and the env and gag regions of human
immunodeficiency virus (HIV).

Keywords: DNA sequence analysis, HIV, influenza, model-based clustering, phylogenetic 
trees

Han-Joon Kim, Sang-Goo Lee

November 2000 Proceedings of the ninth international conference on Information and
Publisher: ACM Press

Full text available: pdf(261.40 KB) Additional Information: full citation, references, citings, index terms



Keywords: agglomerative hierarchical clustering, document clustering, fuzzy information 
retrieval, information organization, relevance feedback 



53 Discovering personally meaningful places: An interactive clustering approach
Changqing Zhou, Dan Frankowski, Pamela Ludford, Shashi Shekhar, Loren Terveen 
July 2007 ACM Transactions on Information Systems (TOIS), Volume 25 issue 3 
Publisher: ACM Press 

Full text available: pdf(817.87 KB) Additional Information: full citation, abstract, references, index terms

The discovery of a person's meaningful places involves obtaining the physical locations and 
their labels for a person's places that matter to his daily life and routines. This problem is 
driven by the requirements from emerging location-aware applications, which allow a user 
to pose queries and obtain information in reference to places, for example, "home", "work" 
or "Northwest Health Club". It is a challenge to map from physical locations to personally 
mea ... 

Keywords: Ubiquitous computing, clustering algorithms, field studies, location-aware 
applications, place discovery 



54 Papers: Hierarchical land cover information retrieval in object-oriented remote
sensing image databases with native queries
Jiang Li

March 2007 Proceedings of the 45th annual southeast regional conference ACM-SE 
45 

Publisher: ACM Press 

Full text available: pdf(958.87 KB) Additional Information: full citation, abstract, references, index terms

Classification and change detection of land cover types in the remotely sensed images is
one of the major applications in remote sensing. This paper presents a hierarchical
framework for land cover information storage and retrieval from object-oriented (OO)
remote sensing image databases. Multi-spectral (band) remotely sensed images are
classified by an optimized k-means clustering algorithm. The land cover maps are then
decomposed and indexed with region quad-tree data structure store ... 

Keywords: change detection, clustering, information retrieval, object-oriented databases, 
remote sensing

Full text available: pdf(12.45 MB) Additional Information: full citation, abstract, cited by, index terms
The Gears of My Childhood 

Before I was two years old I had developed an intense involvement with automobiles. The 
names of car parts made up a very substantial portion of my vocabulary: I was particularly 
proud of knowing about the parts of the transmission system, the gearbox, and most 
especially the differential. It was, of course, many years later before I understood how 
gears work; but once I did, playing with gears became a favorite pastime. I loved rotating 
circular object ... 

56 Cumulating and sharing end users knowledge to improve video indexing in a video
digital library 

Marc Nanard, Jocelyne Nanard 

January 2001 Proceedings of the 1st ACM/IEEE-CS joint conference on Digital 
libraries JCDL '01 

Publisher: ACM Press 

Full text available: pdf(250.01 KB) Additional Information: full citation, abstract, references, citings, index terms

In this paper, we focus on a user driven approach to improve video indexing. It consists
in cumulating the large amount of small, individual efforts done by the users who access
information, and to provide a community management mechanism to let users share the
elicited knowledge. This technique is currently being developed in the "OPALES"
environment and tuned up at the "Institut National de l'Audiovisuel" (INA), a
National Video Library in Paris, to increase the v ... 

Keywords: knowledge sharing, private workspaces, users communities, video annotation, 
video indexing 



57 Document detection: TIPSTER phase I final report
Bill Caid, Stephen Gallant, Joel Carleton, David Sudbeck

September 1993 Proceedings of a workshop on held at Fredericksburg, Virginia:
September 19-23, 1993
Publisher: Association for Computational Linguistics
Full text available: pdf(184 MB) Additional Information: full citation, abstract

During Phase I of the TIPSTER program, HNC developed a unique approach to machine 
learning of similarity of meaning. This approach, embodied in a system called "MatchPlus", 
exploits this learned similarity of meaning for concept-based text retrieval, routing and 
visualization of textual information. MatchPlus uses an information representation scheme 
called "context vectors" to encode similarity of usage. Key attributes of the context vector 
approach are as follows:* Words, documents, and q ... 

58 An architecture to support scalable online personalization on the Web
Anindya Datta, Kaushik Dutta, Debra VanderMeer, Krithi Ramamritham, Shamkant B.
Navathe

August 2001 The VLDB Journal — The International Journal on Very Large Data 

Bases, Volume 10 Issue 1 
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(167.25 KB) Additional Information: full citation, abstract, citings, index terms

Online personalization is of great interest to e-companies. Virtually all personalization 
technologies are based on the idea of storing as much historical customer session data as 
possible, and then querying the data store as customers navigate through a web site. The 
holy grail of online personalization is an environment where fine-grained, detailed 
historical session data can be queried based on current online navigation patterns for use 
in formulating real-time responses. Unfortunately, as mo ...

Keywords: Behavior-based personalization, Dynamic lookahead profile, Profile caching,
Scalable online personalization, Web site and interaction model

Andrew S. Gordon, Eric A. Domeshek

January 1998 Proceedings of the 3rd international conference on Intelligent user
interfaces IUI '98
Publisher: ACM Press

Keywords: digital libraries, retrieval interfaces, thesaurus browsing



60 Natural language processing: Sentence completion
Korinna Grabski, Tobias Scheffer 

July 2004 Proceedings of the 27th annual international ACM SIGIR conference on 
Research and development in information retrieval SIGIR '04 

Publisher: ACM Press 

Full text available: pdf(181.97 KB) Additional Information: full citation, abstract, references, citings, index terms

We discuss a retrieval model in which the task is to complete a sentence, given an initial
fragment, and given an application specific document collection. This model is motivated 
by administrative and call center environments, in which users have to write documents 
with a certain repetitiveness. We formulate the problem setting and discuss appropriate 
performance metrics. We present an index-based retrieval algorithm and a cluster-based 
approach, and evaluate our algorithms using collections of ...

Figure 32: Scatter plot of the training data set for Aquatic toxicity, X=LogP, Y=eLUMO ....
Euclidean Distance Index (k=x=2): ... 

ecb.jrc.cec.eu.int/documents/QSAR/QSAR_TOOLS/Toxmatch_user_manual.pdf - Similar pages



http://ww.google.comte 10/1/2007 



+cluster +"training data" +historical +distance +index +display - Google Search Page 2 of 2




Web Results 11 - 20 of about 20,000 for +cluster +"training data" +historical +distance +index +display. (0.0

[pdf] Application of an Instance Based Learning Algorithm for Predicting ... 
instance. The nearest instance is the CD with the shortest Euclidean distance from the
training instance. The proportion of the training data set and the ... 

www.springerlink.com/index/n7m72257plvl63th.pdf - Similar pages 

[ppt] Building data cubes and mining them with DBMiner
File Format: Microsoft Powerpoint - View as HTML 

DBMiner's Classification Module analyzes a set of training data (i.e. a set of ... The "closest 
cluster" is determined by the shortest distance from a point ... 

iis.fon.bg.ac.yu/SlajdoviVezbe/Building%20data%20cubes%20and%20mining%20them%20with%20DBMiner.ppt - Similar pages

[pdf] Text Alignment with Handwritten Documents 

case we choose the top 30 occurring words in the training data. 3. Take the next image,
to calculate its distance from each cluster: Find ...

doi.ieeecomputersociety.org/10.1109/DIAL2004.1263249 - Similar pages

[pdf] Microsoft PowerPoint - JPL 

File Format: PDF/Adobe Acrobat - View as HTML 

Index images using low-level features. Content-based image retrieval (CBIR): search 

pictures as Bird, car, food, historical buildings, and soccer game ... 

john.cs.olemiss.edu/~ychen/publications/talks/JPL.pdf - Similar pages

[pdf] Sidewall structure estimation from CD-SEM for lithographic process ... 
File Format: PDF/Adobe Acrobat - View as HTML 

training data of known results taken from the same SEM setup and product ... to index a 
database through various means such as distance-based techniques, ... 
www.ornl.gov/sci/ismv/pdfs/publications/Sidewall%20structure%20estimation.pdf - Similar pages

Zulfiqar's web: Data Mining 

These cluster renaming will disappear as you drop/recreate your model/structure and 

that means that we have to look at our training data more closely. ... 
zulfiqar.typepad.com/zulfiqars_web/data_mining/index.html - 220k - Cached - Similar pages 

[pdf] Analysis of power transformer dissolved gas data using the self ... 
same cluster surrounded by the gray border can be defined as. one group. As an example, 
the u-matrix for Fig 5 (i.e., optimum. SOM for training data set D, ... 

ieeexplore.ieee.org/iel5/61/27676/01234676.pdf - Similar pages

A multi-array multi-SNP genotyping algorithm for Affymetrix SNP ... 

SNPs result from single historical mutation events and, as nearby variants The smaller 

the within-cluster distance and the larger the between-cluster ... 

bioinformatics.oxfordjournals.org/cgi/content/full/23/12/1459 - Similar pages

[pdf] Enabling Analytical and Modeling for Enhanced Disease Surveillance 
File Format: PDF/Adobe Acrobat - View as HTML 

the reference distribution indicates a significant cluster of disease These data provide 

training data and the simulated anthrax release gives a ... 

www.osti.gov/bridge/servlets/purl/811182-drSQFI/native/811182.pdf - Similar pages

GUI guide for data mining - Patent 6108004

Winner: The index of the cluster which has the minimum Euclidean distance from the input 
record. Used in the Kohonen Feature Map to determine which output ... 



www.freepatentsonline.com/6108004.html - 78k - Cached - Similar pages




Web Results 21 - 30 of about 20,000 for +cluster +"training data" +historical +distance +index +display. (0.1

[pdf] Ontology-Based Web Site Mapping for Information Exploration Abstract 
File Format: PDF/Adobe Acrobat - View as HTML 

site and identifies the cluster topics and topic relationships based on the .... the browsing 
structure are used as training data for categorizing the new ... 

www.ittc.ku.edu/publications/documents/Zhu1999_CIKM%2099.pdf - Similar pages 



[pdf] An application of the Self-Organizing Map and interactive 3-D ... 
File Format: PDF/Adobe Acrobat - View as HTML 

of the Harrisburg Historical Association and the Shipoke ... Figure 10: The 3-D distance 
map for 1990. 5.5 Temporal cluster analysis ... 

www.geocomputation.org/2001/papers/takatsuka.pdf - Similar pages

[pdf] GEOCOMPUTATION TECHNIQUES IN SPATIAL DATA ANALYSIS: A SURVEY
File Format: PDF/Adobe Acrobat - View as HTML 

cluster finding techniques. To better assess and understand the .... to indicate the different 
spatial regimes associated to the data and display in a ... 

www.dpi.inpe.br/geopro/trabalhos/rnsp_geocomp.pdf - Similar pages

[pdf] Structural Health Monitoring 

such that points within a cluster are more .... distance, the above learning strategy is able
to ... This gives a total of 2880 training data. The ...

shm.sagepub.com/cgi/reprint/3/3/277.pdf - Similar pages

20070816 020037 4 227 sci arttext S0102-311X2001000500002 scielo 1 ...

GAM is basically a cluster finder for point or small-area data which are tolerant of some
imprecision and have plenty of training data available. ...

www.scielo.br/.../?IsisScript=ScieloXML/sci_arttext.xis&def=scielo.def&pid=S0102-311X2001000500002 - 57k - Cached - Similar pages

[pdf] User adaptive handwriting recognition by self-growing ... 

training data is sufficient to initialize a model for the new user. We -means method 

adjusts the center of a cluster based on the distance ...

ieeexplore.ieee.org/iel5/72/19108/00883451.pdf - Similar pages 

[pdf] Temporal Sequence Learning and Data Reduction for Anomaly Detection 
File Format: PDF/Adobe Acrobat - View as HTML 

So the training data may display only one or two outside the cluster mean radius — 

i.e., points whose distance to the center ... 

www.cs.unm.edu/~terran/downloads/pubs/acm_tissec99.pdf - Similar pages

[pdf] Interface '99: A Data Mining Overview 
File Format: PDF/Adobe Acrobat - View as HTML 

demonstrated having idle computers join in an ad-hoc cluster with ... prediction accuracy 
averaged over many training data sets, it can ... 

www.sigkdd.org/explorations/issues/1-2-2000-01/goodman.pdf - Similar pages 

[ppt] Finding Who is Related to Whom— An Application of Shortest Path ... 
File Format: Microsoft Powerpoint - View as HTML 

Table: Distance matrix, the distance value shows the degree of disagreement between 
each pair of records in the training data set. ... 
ai.bpa.arizona.edu/hchen/docs/NIJ-DM-DC2002.ppt - Similar pages 

[pdf] A strategy for integrating product conceptualization and bid ... 

input sample x(m) from the training data of input samples and. compute the Euclidean 

distance ... Step 3: Selecting the minimum distance. Find the index i ...

www.ingentaconnect.com/content/klu/170/2006/00000029/00000005/00002552?crawler=true - Similar pages






